
    Discovering Restricted Regular Expressions with Interleaving

    Discovering a concise schema from given XML documents is an important problem in XML applications. In this paper, we focus on the problem of learning an unordered schema from a given set of XML examples, which amounts to learning a restricted regular expression with interleaving from positive example strings. Schemas with interleaving can capture meaningful knowledge that previous inference techniques cannot disclose. Moreover, inferring the minimal schema with interleaving is challenging: the problem of finding a minimal schema with interleaving is shown to be NP-hard. Therefore, we develop an approximation algorithm and a heuristic solution to tackle the problem, using techniques different from known inference algorithms. We conduct experiments on real-world data sets to demonstrate the effectiveness of our approaches. Our heuristic algorithm is shown to produce results that are very close to optimal. Comment: 12 pages
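For readers unfamiliar with the interleaving operator, the following is a minimal sketch of what it means for one string to be an interleaving (shuffle) of two others. It is not the inference algorithm from the paper; the function name `is_interleaving` and the example strings are assumptions chosen for illustration only.

```python
from functools import lru_cache


def is_interleaving(w: str, u: str, v: str) -> bool:
    """Return True if w merges u and v while preserving the order inside each.

    Illustrative helper (hypothetical name), not taken from the paper.
    """
    if len(w) != len(u) + len(v):
        return False

    @lru_cache(maxsize=None)
    def go(i: int, j: int) -> bool:
        # i symbols of u and j symbols of v have been consumed so far.
        if i == len(u) and j == len(v):
            return True
        k = i + j  # position in w that must be matched next
        if i < len(u) and u[i] == w[k] and go(i + 1, j):
            return True
        if j < len(v) and v[j] == w[k] and go(i, j + 1):
            return True
        return False

    return go(0, 0)
```

Under this definition, `is_interleaving("acbd", "ab", "cd")` holds, while `"bacd"` is rejected because `b` would precede `a`, which breaks the order inside `"ab"`.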

    Active learning of group-structured environments

    The question investigated in this paper is to what extent an input representation influences the success of learning, in particular from the point of view of analyzing agents that can interact with their environment. We investigate learning environments that have a group structure. We introduce a learning model in different variants and study under which circumstances group structures can be learned efficiently from experimenting with group generators (actions). Negative results are presented, even without efficiency constraints, for rather general classes of groups showing that even with group structure, learning an environment from partial information is far from trivial. However, positive results for special subclasses of Abelian groups turn out to be a good starting point for the design of efficient learning algorithms based on structured representations.

    Learning stochastic finite automata from experts


    Searching for Leptoquarks in electron-photon Collisions

    We study the production of composite scalar leptoquarks in $e\gamma$ colliders, and we show that an $e^+e^-$ machine operating in its $e\gamma$ mode is the best way to look for these particles in $e^+e^-$ collisions, due to the hadronic content of the photon. Comment: 12 pages in REVTeX3. 6 figures appended as PostScript files. Report: IFT-P.014/93 and IFUSP-P 104

    Hitting all Maximal Independent Sets of a Bipartite Graph

    We prove that, given a bipartite graph G with vertex set V and an integer k, deciding whether there exists a subset of V of size k hitting all maximal independent sets of G is complete for the class $\Sigma_2^P$. Comment: v3: minor changes
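To make the decision problem concrete, here is a brute-force sketch for very small graphs. It enumerates every maximal independent set and then tries every vertex subset of size k. The function names and the example graph are assumptions for illustration, and the exponential running time is consistent with (but says nothing further about) the hardness result stated above.

```python
from itertools import combinations


def maximal_independent_sets(vertices, edges):
    """Enumerate all maximal independent sets by exhaustive search."""
    adj = {v: set() for v in vertices}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    found = []
    for r in range(len(vertices) + 1):
        for cand in combinations(vertices, r):
            s = set(cand)
            # Independent: no vertex in s has a neighbour in s.
            independent = all(adj[v].isdisjoint(s) for v in s)
            # Maximal: every vertex outside s has a neighbour inside s.
            maximal = all(adj[v] & s for v in vertices if v not in s)
            if independent and maximal:
                found.append(frozenset(s))
    return found


def has_hitting_set(vertices, edges, k):
    """Return True if some k-subset of vertices meets every maximal independent set."""
    mis = maximal_independent_sets(vertices, edges)
    return any(
        all(set(h) & m for m in mis) for h in combinations(vertices, k)
    )
```

For the path a-b-c the maximal independent sets are {b} and {a, c}, so no single vertex hits both, but a pair such as {a, b} does.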

    Variable length-based genetic representation to automatically evolve wrappers

    The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-642-12433-4_44. Proceedings of the 8th International Conference on Practical Applications of Agents and Multiagent Systems. The Web has been the star service on the Internet; however, the outsized volume of information available and its decentralized nature have created an intrinsic difficulty in locating, extracting, and composing information. An automatic approach is required to handle this huge amount of data. In this paper we present a machine learning algorithm based on Genetic Algorithms which generates a set of complex wrappers able to extract information from the Web. The paper presents the experimental evaluation of these wrappers over a set of basic data sets. This work has been partially supported by the Spanish Ministry of Science and Innovation under the projects Castilla-La Mancha project PEII09-0266-6640, COMPUBIODIVE (TIN2007-65989), and by V-LeaF (TIN2008-02729-E/TIN)

    Mining State-Based Models from Proof Corpora

    Interactive theorem provers have been used extensively to reason about various software/hardware systems and mathematical theorems. The key challenge when using an interactive prover is that finding a suitable sequence of proof steps leading to a successful proof requires a significant amount of human intervention. This paper presents an automated technique that takes as input examples of successful proofs and infers an Extended Finite State Machine as output. This can in turn be used to generate proofs of new conjectures. Our preliminary experiments show that the inferred models are generally accurate (contain few false-positive sequences) and that representing existing proofs in such a way can be very useful when guiding new ones. Comment: To Appear at Conferences on Intelligent Computer Mathematics 201

    A case study on grammatical-based representation for regular expression evolution

    The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-642-12433-4_45. Proceedings of the 8th International Conference on Practical Applications of Agents and Multiagent Systems. Regular expressions, or simply regex, have been widely used for decades as a powerful pattern matching and text extraction tool. Although they provide a powerful and flexible notation to define and retrieve patterns from text, the syntax and the grammatical rules of these regex notations are not easy to use, or even to understand. Any regex can be represented as a Deterministic or Non-Deterministic Finite Automaton, so it is possible to design a representation to automatically build a regex, and an optimization algorithm able to find the best regex in terms of complexity. This paper introduces both a graph-based representation for regex and a particular heuristic-based evolutionary computing algorithm based on grammatical features from this language, applied to a particular data extraction problem. This work has been partially supported by the Spanish Ministry of Science and Innovation under the projects Castilla-La Mancha project PEII09-0266-6640, COMPUBIODIVE (TIN2007-65989), and by HADA (TIN2007-64718)

    Learning Rational Functions

    Rational functions are transformations from words to words that can be defined by string transducers. Rational functions are also captured by deterministic string transducers with lookahead. We show for the first time that the class of rational functions can be learned in the limit with polynomial time and data, when represented by string transducers with lookahead in the diagonal-minimal normal form that we introduce.